Solution: Create an Intersection Table
Let's devise a solution for Jaywalking by creating an intersection table.
Instead of storing the account_id
in the Products
table, we can store it in a separate table, so each individual value of that attribute occupies a separate row. This new table Contacts
implements a many-to-many relationship between Products
and Accounts
:
When the table has foreign keys referencing two tables, it’s called an intersection table. This implements a many-to-many relationship between the two referenced tables. That is, each product may be associated through the intersection table to multiple accounts, and likewise, each account may be associated with multiple products. See the Intersection table Entity-Relationship Diagram below.
Let’s see how using an intersection table resolves all the problems we saw in the previous lesson.
Querying products by account and the other way around#
In order to query the attributes of all products for a given account, it’s more straightforward to join the Products
table with the Contacts
table.
Let’s run the code in the playground below. You can also update the code written below.
Some people resist queries that contain a JOIN
, thinking that they perform poorly. However, this query uses indexes much better than the solution shown earlier in the previous lesson.
Querying account details is likewise easy to read and easy to optimize. It uses indexes for JOIN
efficiently, instead of an esoteric use of “regular expressions.”
Let’s run the query below to see the effect of this query on the database.
Note: The records for
account_id
having the values12
,23
, and34
are already available in the database. If we want to query some more data, we can add the data by using the standard syntax.
The same is the case with the Products
table. The records for 123, 345, and 567 are already available. If we want to query some more data, we can add the data by using the standard syntax.
Making aggregate queries#
Let’s run the given query in the following widget. We can also try to use some other aggregate queries that use functions like COUNT()
, SUM()
, and AVG()
.
The following example returns the number of accounts per product:
The number of products per account is just as simple:
Note: Try to run the query given above. You can edit the code and see the effect on the database.
Other more sophisticated reports are possible too, such as the product with the greatest number of accounts:
Updating contacts for a specific product#
We can add or remove entries in the list by inserting or deleting rows in the
intersection table. Each product reference is stored in a separate row in the
Contacts
table, so we can add or remove them one at a time.
Note: If we want to insert data other than what is written in the playground, we must first check the available data by querying the specific table.
Now, let’s try to delete the same data that we added in the previous playground.
Validating product IDs#
We can use a foreign key to validate the entries against a set of legitimate
values in another table. We declare that Contacts.account_id
references
Accounts.account_id
, and therefore rely on the database to enforce referential integrity. Now we can be sure that the intersection table contains only
account IDs that exist.
We can also use SQL data types to restrict entries. For example, if the entries in the list should be valid INTEGER
or DATE
values and we declare the column using those data types, we can be sure that all entries are legal values of that type (not nonsense entries like “banana”).
Let’s add “banana” instead of “456” product_id
and see what happens.
Choosing a separator character#
We don’t use any separator character since we store each entry on a separate row. In this way, there’s no ambiguity even if the entries themselves contain commas or other characters that may be used as separators.
List length limitations#
Since each entry is in a separate row in the intersection table, the list is limited only by the number of rows that can physically exist in one table. If it’s appropriate to limit the number of entries, we should enforce the policy in our application using the count of entries rather than the collective length of the list.
Other advantages of the intersection table#
Creating an index on Contacts.account_id
makes performance better than matching a substring in a comma-separated list. Declaring a foreign key on a column implicitly creates an index on that column in many database brands.
We can also create additional attributes for each entry by adding columns to the intersection table. For example, we can record the date a contact was added for a given product, or we can specify an attribute to differentiate primary contacts from secondary contacts. We can’t do this in a comma-separated list.